-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-37093: [Python] Add async Flight client with GetFlightInfo #36986
GH-37093: [Python] Add async Flight client with GetFlightInfo #36986
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
25fa26f
to
1bf01ae
Compare
CC @pitrou @jorisvandenbossche this is a skeleton of how to bind async C++ code to asyncio in Python, though I still need to get it working with both asyncio and Trio. (I'm looking at https://github.com/codypiersall/pynng/blob/master/pynng/_aio.py as a reference of how we might do that.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a good start to me.
python/pyarrow/_flight.pyx
Outdated
@@ -1320,6 +1399,11 @@ cdef class FlightClient(_Weakrefable): | |||
check_flight_status(CFlightClient.Connect(c_location, c_options | |||
).Value(&self.client)) | |||
|
|||
def as_async(self) -> None: | |||
if anyio is None: | |||
raise RuntimeError("anyio is required for the async interface") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why we would want anyio
. Just start with plain asyncio
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main thing is to support both asyncio and Trio users (though I think anyio is unnecessary and we can just directly support them both)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we want that, though? asyncio is the de facto standard. And the anyio API for delegating stuff to external threads seems rather convoluted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a reason to just lock out a part of the ecosystem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And FWIW, one of the very first issues opened against ADBC was request for async support via anyio specifically, to support both Trio and asyncio
1bf01ae
to
a89fe71
Compare
Updated, added basic tests. I'll deal with Trio in a later PR, as well as raising an RpcError-type exception instead of built in exceptions (again: RPC errors should be distinct, and should be handled uniformly), once I've reviewed the C++ side and decided I'm happy with TransportStatusDetail. There's more we can do with e.g. pytest-asyncio and such (down the line, we'd want all tests to be parametrized on asyncio vs Trio) that I don't think is worth tackling immediately. |
a89fe71
to
68f2aa0
Compare
Hmm, that's an interesting segfault in CI. |
Alright:
|
So, I don't understand why, but bouncing through a pure-Python thread fixes the segfault. It seems something about having the C-callback in the traceback at some point borks the interpreter. It does seem that the asyncio implementation tries to get the traceback unconditionally (in call_soon_threadsafe -> call_soon -> asyncio.events.Handle) so I suppose maybe Cython isn't giving a valid traceback object for No clue about the first error... |
Can the segfault be reproduced locally? |
Yes, just run the test with PYTHONDEVMODE. The other error only seems to reproduce under Docker though I haven't tried very hard so far. |
Ok, I'll take a look hopefully on Monday. |
Looking at these:
It looks like Cython indeed doesn't generate stack frames for calls, and also modifying |
0cc328e
to
4154e4f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot. This really opens interesting possibilities, I think.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
async def get_flight_info( | ||
self, | ||
descriptor: FlightDescriptor, | ||
*, | ||
options: FlightCallOptions = None, | ||
): | ||
call = AsyncioCall() | ||
self._get_flight_info(call, descriptor, options) | ||
return await call.as_awaitable() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could just be:
async def get_flight_info( | |
self, | |
descriptor: FlightDescriptor, | |
*, | |
options: FlightCallOptions = None, | |
): | |
call = AsyncioCall() | |
self._get_flight_info(call, descriptor, options) | |
return await call.as_awaitable() | |
def get_flight_info( | |
self, | |
descriptor: FlightDescriptor, | |
*, | |
options: FlightCallOptions = None, | |
): | |
call = AsyncioCall() | |
self._get_flight_info(call, descriptor, options) | |
return call.as_awaitable() |
but I've kept the explicit async def
for introspection and self-documentation... what do you think @jorisvandenbossche ?
At least
We could perhaps expose |
That said, it would perhaps be nicer to expose the same error message in Python as in C++... |
@github-actions crossbow submit -g python |
/// This is like supports_async(), except that a detailed error message | ||
/// is returned if async support is not available. If async support is | ||
/// available, this function returns successfully. | ||
Status CheckAsyncSupport() const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lidavidm Does this API look ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, though I would be happy to replace supports_async
entirely with this.
Revision: 58a938a Submitted crossbow builds: ursacomputing/crossbow @ actions-b4276e2530 |
CI failures are unrelated. |
After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit b6ab909. There were 4 benchmark results indicating a performance regression:
The full Conbench report has more details. It also includes information about possible false positives for unstable benchmarks that are known to sometimes produce them. |
…pache#36986) ### Rationale for this change Demonstrate how to deal with async Flight from Python. ### What changes are included in this PR? Add bindings for GetFlightInfo. ### Are these changes tested? Yes. ### Are there any user-facing changes? Yes, new APIs. * Closes: apache#37093 Lead-authored-by: David Li <[email protected]> Co-authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
Rationale for this change
Demonstrate how to deal with async Flight from Python.
What changes are included in this PR?
Add bindings for GetFlightInfo.
Are these changes tested?
Yes.
Are there any user-facing changes?
Yes, new APIs.